1 Introduction

COVID-19 is today considered a significant threat to the economy, social life, education, and the workplace; in short, to human lives. The causative agent is a single-stranded RNA virus with a thick envelope. The virus can spread through droplets from coughing, direct contact, or touching contaminated surfaces with unwashed hands [1]. Immigrants’ unwillingness to follow social distancing and quarantine rules can be one of the primary causes of COVID-19 transmission within communities [2].

Fig. 1 The study area (district-wise map of Bangladesh) for the research

The virus was initially detected on a limited scale in November 2019. The following month, in December 2019, the first significant cluster was discovered in Wuhan, China [3, 4]. The World Health Organization (WHO) declared the outbreak a public health emergency of international concern on January 30, 2020 [5], and declared it a pandemic on March 12 of the same year [6]. The virus has now infected about 1.19 million people in China, and about 5224 people have died. SARS-CoV-2 spread within and outside China, affecting people who had never been in contact with animals, which means that the virus can spread from one individual to another. According to [7], 535,938,392 cases and 6,321,595 deaths had been recorded worldwide as of June 7, 2022. The first COVID-19 patient in Bangladesh was identified on March 8, 2020 [8]. The Institute of Epidemiology, Disease Control and Research (IEDCR), Bangladesh’s premier national institute for surveillance, outbreak investigation, and research on existing, emerging, or undiscovered infectious illnesses, hired more than 80 personnel in June 2020 to increase COVID-19 surveillance and contact tracing. This was done with funding and technical support from the WHO [9]. As of June 7, 2022, Bangladesh had 1,953,700 identified cases and 29,131 deaths [10]. Because countries differ culturally, politically, socioeconomically, and environmentally, it is more important than ever for people worldwide to work together to reduce the adverse effects [11].

Data science plays an increasingly essential role in finding solutions to societal and economic issues arising from the exponential growth in the amount of data and the ongoing advancements in information technology [12]. It has a significant presence in business data mining [12], which enables real-time decision-making through a mix of technologies involving artificial intelligence (AI) and the internet of things (IoT) [13]. Data science methodologies have been applied to various challenges, including crop harvesting, characterization of epidemiological outbreak patterns, commercial data mining, and e-commerce fraud [11,12,13,14,15,16]. Data science is also used in healthcare, especially since the COVID-19 pandemic spread around the world at the beginning of 2020 [17]. To explain the transmission pattern of the COVID-19 outbreak in China, numerous data science tools have been used in retrospective and prospective investigations based on age-specific, social contact-based transmission [16, 17]. Epidemiology may be defined as the investigation of the incidence and spread of illnesses to determine the factors that cause them [18, 19].

A study using ARIMA models predicts COVID-19 will fill ICUs in Italy, France, and Spain. The number of instances will rise if the virus stays the same. Clinical and societal difficulties may be intractable, leading to catastrophe [20]. Cluster analysis was used by [14] to classify actual groups of COVID-19 datasets representing multiple states and union territories in India. This work aimed to enhance monitoring procedures and improve government policy. Previous research investigated demographic aspects important for COVID-19 transmission in Bangladesh using conventional statistical models [21], but no research has examined the spatial dependency of COVID-19 cases across Bangladesh’s 64 districts. The traditional statistical methods assume that the observations are independent and identically distributed. A cluster pattern violates classical assumptions of independence and homogeneity (stationarity) and renders classical methods inefficient or inappropriate.

Bayesian inference is used for disease mapping by estimating parameters based on observed data and prior assumptions [22]. Bayesian spatial modeling is a helpful tool for investigating the relative risk of COVID-19 [23,24,25,26]. Hierarchical Bayesian methods are often used to model overdispersion and spatial correlation in the data. Two classic random-effects solutions to the problem are the Poisson-Gamma and Poisson-Lognormal models. Both account for extra-Poisson variation (overdispersion) in the data [27], called “uncorrelated heterogeneity” in disease mapping. Overdispersion can be caused by a spatially unstructured covariate, many zero counts, or many counts far from the mean. These models assume a gamma-distributed or lognormally distributed random effect to deal with overdispersed data, and they dominated early disease mapping [28]. The Conditional Autoregressive (CAR) model investigates how data are related spatially [29]. Because it can use many weighting schemes, the model is widely used, and it provides better solutions than unstructured alternatives. Convolution (COV) models combine unstructured and structured random effects [30, 31]. Using two independent sets of random effects, one gamma and the other normal, the combined model takes into consideration both overdispersion and clustering [32].

In this manuscript, we seek to answer the following research questions. Does the spatial pattern of the relative risk of COVID-19 indicate that districts with similar risk are geographically clustered? How can the spatial dependence and spatial heterogeneity of the data be measured? How can statistical models capture them? If there is a cluster, do governments tend to compare their policies with those of neighbouring districts, or do they behave independently?

Spatial autocorrelation measures, including Moran I and Geary C, are used to measure spatial dependency. Bayesian hierarchical models fitted via Gibbs sampling are employed to assess the heterogeneity of the spatial data. The data source and the Bayesian statistical models are described in Sect. 2. The most critical factors behind the high relative risk of COVID-19 in Bangladesh are identified in Sect. 3 using spatial autocorrelation and the results of the best-suited model. Finally, Sect. 4 highlights how the study’s results could give government agencies useful information for taking actions that will reduce the prevalence of COVID-19.

2 Methods and Materials

2.1 Spatial Data Source and Description

The data used in this article were obtained from the Directorate General of Health Services (DGHS) (https://dghs.gov.bd/). The number of affected cases was collected from this source for the period from June 5, 2021 to May 14, 2022. Two other variables, the annual growth rate per district and the population per district, were obtained from the 2011 census of the Bangladesh Bureau of Statistics (BBS) (https://tinyurl.com/3dcspsfp). The projected 2021 population of all 64 districts was taken from (https://tinyurl.com/3j4ff8r4). The growth rate for 2022 is then calculated from the projected 2021 population using the geometric model, with the following formula:

$$\begin{aligned} r=\root n \of {P/P_0}-1 \end{aligned}$$

where \(P_0\) is the population of 2021, P is the projected population of 2022, n is the number of years in the intercensal period, and r is the growth rate. The prevalence rate is calculated as follows:

$$\begin{aligned} \text{ Prevalence } \text{ Rate }=\frac{\text{ Affected } \text{ No. } \text{ of } \text{ Cases }}{\text{ Annual } \text{ Growth } \text{ Rate } \times \text{ District } \text{ Population }}\times 100,000 \end{aligned}$$

It is necessary to determine the prevalence rate for each district individually. After that, the results estimate how many individuals are affected out of a total population of 100,000 in each district.
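As an illustration, the two formulas above can be sketched in Python. The function names and the district figures are hypothetical and are not taken from the DGHS or BBS data; the prevalence function simply follows the formula as printed above.

```python
def geometric_growth_rate(p0, p, n=1):
    """Growth rate r = (P / P0)**(1/n) - 1 from the geometric model."""
    return (p / p0) ** (1.0 / n) - 1.0


def prevalence_rate(cases, growth_rate, population, per=100_000):
    """Prevalence rate as printed above:
    cases / (annual growth rate x district population) x 100,000."""
    return cases / (growth_rate * population) * per


# Hypothetical district: 1,000,000 people in 2021 and 1,012,000 projected for 2022.
r = geometric_growth_rate(p0=1_000_000, p=1_012_000, n=1)
rate = prevalence_rate(cases=600, growth_rate=r, population=1_000_000)
print(f"growth rate = {r:.4f}, prevalence rate = {rate:.1f}")
```

With one year between the two population figures (n = 1), the geometric growth rate reduces to the simple relative change P/P_0 - 1.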

2.2 Distribution of Response Variable

The Poisson distribution is a discrete probability distribution. This means that the value of the variable can only be a whole number, like 0, 1, 2, 3, etc. It can’t be a fraction or a decimal [33]. The probability mass function (pmf) is given below:

$$\begin{aligned} \Pr (Y=y)=\frac{e^{-\lambda }{\lambda }^y}{y!};\quad y=0,1,2,\ldots \end{aligned}$$

where, e is Euler’s number (\(e = 2.71828\ldots \)), y is the number of occurrences, y! is the factorial of y, \(\lambda \) is the rate of occurrence.
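A minimal Python sketch of this probability mass function is given below. The function name and the rate value are assumptions chosen for illustration only.

```python
import math


def poisson_pmf(y, lam):
    """Pr(Y = y) = exp(-lam) * lam**y / y! for y = 0, 1, 2, ..."""
    if y < 0 or int(y) != y:
        return 0.0  # no probability mass on negative or non-integer values
    y = int(y)
    return math.exp(-lam) * lam ** y / math.factorial(y)


lam = 3.0  # hypothetical rate of occurrence
probs = [poisson_pmf(y, lam) for y in range(60)]
total = sum(probs)  # should be numerically close to 1
print(f"Pr(Y=0) = {probs[0]:.4f}, sum over y < 60 = {total:.6f}")
```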

2.3 Spatial Autocorrelation

Getis [34] states that spatial autocorrelation is one of the essential ideas of spatial analysis. It helps to determine whether there is systematic spatial variation by simultaneously taking into account the locations of the districts and their associated values [35]. This correlation is a departure from the assumption of independent observations used in conventional models [36]. Spatial autocorrelation analysis examines the spatial pattern of individual entities, determining whether they are clustered, dispersed, or random [37].

2.3.1 Moran I Autocorrelation

Moran’s I is a correlation coefficient that measures a data set’s overall spatial autocorrelation. In other words, it assesses how similar one object is to others around it. If nearby objects resemble each other, the observations are not independent. Given a set of features and a related attribute, it determines whether the pattern expressed is clustered, dispersed, or random. It is useful for comparing the value of a variable at any one location with the values at other locations, and thus for judging whether one site is autocorrelated with others. Moran’s I values range roughly from \(-1\) to \(+1\), although the statistic is not strictly bounded by one, with an expected value of \(-1/(n-1)\) [38]. The formula for Moran’s I statistic, which is similar to Pearson’s correlation coefficient [39], is given in Equation 1:

$$\begin{aligned} I=\frac{n\sum _i{\sum _j{\omega _{ij}Z_iZ_j}}}{(n-1)\sum _i{\sum _j{\omega _{ij}}}} \end{aligned}$$
(1)

where n is the number of districts; \(\omega _{ij}\) is the spatial weight between districts i and j; \(Z_i\) and \(Z_j\) are the z-scores of the variable of interest; and the numerator sums the weighted products of the z-scores of adjacent districts. The weights are row-standardized, so that \(\sum _j \omega _{ij}=1\) for each district i. The initial phase of the spatial autocorrelation study is to generate a spatial weight matrix containing information on the neighborhood structure for each site. Adjacency covers the administrative districts closely adjacent to a district, including the district itself. Administrative districts that are not adjacent to one another are given no weight [40].
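The following Python sketch evaluates Equation 1 on a toy example. The four "districts" arranged in a line, their values, and the function names are hypothetical; for simplicity a district is not treated as its own neighbor here, and the z-scores use the sample standard deviation (divisor n - 1), which is an assumption consistent with the (n - 1) factor in Equation 1.

```python
import math


def row_standardize(neighbors):
    """Build a row-standardized weight matrix from neighbor index lists."""
    n = len(neighbors)
    w = [[0.0] * n for _ in range(n)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)  # each row sums to 1
    return w


def morans_i(x, w):
    """Global Moran's I: n * sum_ij w_ij z_i z_j / ((n - 1) * sum_ij w_ij)."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    z = [(v - mean) / sd for v in x]
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    s0 = sum(sum(row) for row in w)
    return n * num / ((n - 1) * s0)


# Hypothetical chain of four districts: 0-1, 1-2, 2-3 share a boundary.
w = row_standardize([[1], [0, 2], [1, 3], [2]])
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], w)        # similar values are adjacent
i_alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)  # high and low values alternate
print(i_trend, i_alternating)
```

The smooth trend gives a positive value and the alternating pattern a negative one, which matches the interpretation of clustering versus dispersion.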

Two neighbouring districts whose scores lie on the same side of the mean (both high or both low) contribute positive components to the numerator, and hence to positive spatial autocorrelation. In contrast, negative spatial autocorrelation arises when districts with high scores neighbour districts with low scores [41]. If the absolute value of the z-score is high, the clustering is intense; however, whether the clustering is significant is determined by the p-value. A p-value below the significance level (0.05) together with a very high z-value indicates that the null hypothesis should be rejected, which suggests that there is clustering [42]. On the other hand, a high p-value and a low z-value imply that the null hypothesis cannot be rejected. A Moran’s I value of \(-1\) indicates perfect dispersion, zero suggests a random spatial pattern, and \(+1\) indicates perfect clustering, that is, perfect positive spatial autocorrelation.

The Global Moran’s I is an inferential statistic, which means that the investigation findings are always understood in the context of its null hypothesis. Let,

$$\begin{aligned} H_0:&\ \text{The 64 districts are randomly distributed, i.e., there is no spatial dependency}\\&\ \text{between the neighboring districts (no spatial clustering exists);}\\ H_1:&\ \text{The 64 districts are positively autocorrelated (spatial clustering exists).} \end{aligned}$$

2.3.2 Local Moran I

Various approaches to local spatial autocorrelation have been developed over the last few decades [43,44,45]. Local Moran’s I is one of the best-known indicators for quantifying the degree of similarity between a district and its neighbors. Researchers compute the local Moran’s I to find local clusters and geographic outliers [46, 47]. According to Anselin [46], spatial statistics can find autocorrelation of specified orders in the studied area. He developed LISA (Local Indicators of Spatial Autocorrelation), which indicates on a map, for each observation, how strongly similar values are clustered around that observation [44]. The local Moran I is defined as follows:

$$\begin{aligned} I_i=p_i\sum _j \omega _{ij}p_j \end{aligned}$$
(2)

where \(p_i\) is the deviation of district i’s relative risk from the mean; \(p_j\) is the corresponding deviation for a neighboring district j; and \(\omega _{ij}\) is the weight of neighboring area j in the statistic, normalized for the number of neighbors.
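A short Python sketch of Equation 2 follows. The weight matrix, the relative-risk values, and the function name are hypothetical; the quadrant labels (High-High, Low-Low, and so on) are assigned from the signs of the deviation and of its spatial lag only, without the significance test that a LISA cluster map would also apply.

```python
def local_morans_i(x, w):
    """Local Moran's I_i = p_i * sum_j w_ij p_j, with p = deviation from the mean."""
    n = len(x)
    mean = sum(x) / n
    p = [v - mean for v in x]
    lag = [sum(w[i][j] * p[j] for j in range(n)) for i in range(n)]
    local_i = [p[i] * lag[i] for i in range(n)]
    labels = [
        ("High" if pi > 0 else "Low") + "-" + ("High" if li > 0 else "Low")
        for pi, li in zip(p, lag)
    ]
    return local_i, labels


# Hypothetical row-standardized weights for four districts in a line.
w = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
local_i, labels = local_morans_i([1.0, 2.0, 3.0, 4.0], w)
for i, (value, label) in enumerate(zip(local_i, labels)):
    print(f"district {i}: I_i = {value:.2f} ({label})")
```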

2.3.3 Geary C

Geary’s C is a measure of spatial autocorrelation, which can also be thought of as an attempt to assess whether or not neighboring observations of the same occurrence are associated with one another [47]. The correlation in spatial autocorrelation is multi-dimensional and works in both directions, making it a more complicated concept than simple autocorrelation. The formula used for Geary C calculation, defined by [47] in this study, is expressed in Equation 3

$$\begin{aligned} C=\frac{(n-1)\sum _i{\sum _j{\omega _{ij}{(x_i-x_j)}^2}}}{2\omega \sum _i{{(x_i-\overline{x})}^2}} \end{aligned}$$
(3)

where, x is the variable of interest; \(\bar{x}\) is the mean of x; \(\omega \) is the sum of all \(\omega _{ij}\).

The value of C falls within the range [0,2]. If \(0{\le }C{<}1\), there is positive autocorrelation among the districts. A value of C close to 1 indicates little or no spatial autocorrelation. If \(1{<}C{\le }2\), one can deduce that there is negative autocorrelation between the districts as a whole.
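Equation 3 and the interpretation of its range can be checked on a toy example. The sketch below is illustrative only: the weight matrix, the data values, and the function name are assumptions.

```python
def gearys_c(x, w):
    """Geary's C = (n - 1) * sum_ij w_ij (x_i - x_j)^2 / (2 * W * sum_i (x_i - mean)^2)."""
    n = len(x)
    mean = sum(x) / n
    num = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    w_total = sum(sum(row) for row in w)  # W, the sum of all weights
    den = 2.0 * w_total * sum((v - mean) ** 2 for v in x)
    return (n - 1) * num / den


# Hypothetical row-standardized weights for four districts in a line.
w = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
c_trend = gearys_c([1.0, 2.0, 3.0, 4.0], w)        # neighbors are similar
c_alternating = gearys_c([1.0, 4.0, 1.0, 4.0], w)  # neighbors are dissimilar
print(c_trend, c_alternating)
```

The first value falls below 1 (positive autocorrelation) and the second above 1 (negative autocorrelation).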

2.4 Spatial Regression Models

Spatial regression is a component of regression models that incorporate spatial position. The presence of a dependent relationship among a set of observations, known as spatial dependence, indicates that the model follows an autoregressive process [48, 49].

2.4.1 Poisson-Gamma Model

A negative binomial model can readily be used to model additional variation as an alternative to the Poisson model. Consider that a negative binomial distribution can be seen as a mixed model with gamma random-effects for each area which is alternatively known as the Poisson-gamma model [50]. This model assumes that the number of affected cases within each district is independent and follows a Poisson distribution with mean \(e_i\theta _i\) i.e., \(y_i\sim \text{ Poisson }(e_i\theta _i)\), with the assumption that

$$\begin{aligned} \lambda _i=e_i\theta _i\quad ; \quad i=1,2,3,\ldots ,64 \end{aligned}$$

is constant within each district. The parameter of interest in the model is the relative risk (\(\theta _i\)). To account for unobserved heterogeneity, \(\theta _i\) is assumed to follow a gamma prior distribution with parameters a and b, which is conjugate to the Poisson likelihood. The relative risk therefore has a gamma posterior, that is

$$\begin{aligned} \theta _i \sim \text{ Gamma }(a+y_i,b+e_i). \end{aligned}$$

The Poisson-Gamma model assumes that the observations are independent. Most spatial data are correlated, however, and the model neither takes into account the spatial correlation between risks in nearby areas nor allows an easy adjustment for spatial covariates. For this reason, the PLN, CAR, and convolution models were considered.
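The conjugate update above can be sketched in a few lines of Python. The prior parameters, the observed and expected counts, and the function name are hypothetical values chosen for illustration, not figures from the study.

```python
def poisson_gamma_posterior(y, e, a, b):
    """Posterior Gamma(a + y, b + e) for the relative risk and its mean."""
    shape = a + y
    rate = b + e
    return shape, rate, shape / rate


# Hypothetical district: 30 observed cases, 20 expected cases, Gamma(1, 1) prior.
shape, rate, post_mean = poisson_gamma_posterior(y=30, e=20.0, a=1.0, b=1.0)
crude_ratio = 30 / 20.0  # observed / expected, without smoothing
print(f"posterior Gamma({shape}, {rate}), mean = {post_mean:.3f}, crude = {crude_ratio}")
```

The posterior mean lies between the crude ratio y/e and the prior mean a/b, which is the smoothing effect of the gamma prior.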

2.4.2 Poisson-Lognormal Model

The Poisson-lognormal (PLN) model is an alternative that can be considered in place of the Poisson-gamma model. It connects the relative risk, denoted by \(\theta _i\), to a linear predictor that includes a normally distributed random effects component, denoted by \(v_i\) [50]. The log-normal model for the relative risk is defined as:

$$\begin{aligned} y_i\sim \text{ Poisson }(e_i\theta _i) \end{aligned}$$

with

$$\begin{aligned} \log (\theta _i)=\alpha +v_i; \quad i=1,2,3,\ldots ,64 \end{aligned}$$

where \(v_i\sim \text{ N }(0,\sigma _v^2)\) is the district-specific random effect, capturing extra-Poisson variability in the log-relative risk of COVID-19 in area i, \(i=1,2,\ldots ,64\), and \(\alpha \) is the overall level of the relative risk. In the Poisson-Gamma model we consider \(\theta _i\sim \text {Gamma}(a,b)\), whereas in the PLN model we consider \(e^{v_i}\sim \text {Lognormal}(0,\sigma ^2_v)\) with precision \(\tau ^2_v=1/\sigma ^2_v\), where \(\sigma _v^2\) follows a gamma prior distribution.

2.4.3 Conditional Autoregressive Model

In this model, the district-specific random effect component takes into account the effects that vary in a structured manner in space, i.e., the correlated heterogeneity. The model was introduced by [28] in an empirical Bayes setting and developed by [29] in a fully Bayes implementation. The model is defined as follows:

$$\begin{aligned} y_i\sim \text{ Poisson }(e_i\theta _i) \end{aligned}$$

with

$$\begin{aligned} \log (\theta _i)=\alpha +u_i; \quad i=1,2,3,\ldots ,64 \end{aligned}$$

where \(\alpha \) is the overall level of the relative risk and \(u_i\) denotes the correlated heterogeneity, i.e., a district-specific random effect that depends on the values \(u_j\) of the random effects in neighboring areas. The model uses a spatial correlation structure in which the estimated risk in any area depends on the neighbouring areas [50]. The correlated heterogeneity terms are assumed to follow an intrinsic CAR model, such as the one presented by [51]: each CAR random effect follows a normal distribution whose mean and variance are determined by weighting over the adjacent areas, i.e.

$$\begin{aligned} {[}u_i|u_j,i\ne j,{{\tau }_u}^2]&\sim N({\overline{u}}_i,{{\sigma }_i}^2)\\ {\overline{u}}_i&=\frac{1}{\sum _j{{\omega }_{ij}}}\sum _j{u_j{\omega }_{ij}},\quad {{\sigma }_i}^2=\frac{{{\sigma }_u}^2}{\sum _j{{\omega }_{ij}}} \end{aligned}$$

where \(u_i\) is smoothed towards the mean in the set of neighbouring areas: the conditional mean \(\overline{u}_i\) is the weighted average of the spatial random effects of the neighbors, and \(\sigma _u^2\) is the variance parameter, with precision \(\tau _u^2=1/\sigma _u^2\). Here, \(\sigma _u^2\) follows a gamma prior distribution.
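To make the conditional structure concrete, the Python sketch below computes the conditional mean (the weighted average of the neighbors' effects) and the conditional variance (\(\sigma _u^2\) divided by the sum of the weights) for one district. The binary adjacency weights, the random-effect values, the variance, and the function name are all hypothetical.

```python
def car_conditional(i, u, w, sigma2_u):
    """Conditional mean and variance of u_i given the other random effects."""
    others = [j for j in range(len(u)) if j != i]
    w_sum = sum(w[i][j] for j in others)
    mean = sum(w[i][j] * u[j] for j in others) / w_sum  # weighted neighbor average
    variance = sigma2_u / w_sum  # shrinks as the total neighbor weight grows
    return mean, variance


# Hypothetical binary adjacency for four districts in a line.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
u = [0.2, 0.0, 0.6, -0.1]  # hypothetical current values of the random effects
mean_1, var_1 = car_conditional(1, u, w, sigma2_u=0.5)
print(f"district 1: conditional mean = {mean_1:.2f}, variance = {var_1:.2f}")
```

District 1 has two neighbors, so its conditional variance is half of \(\sigma _u^2\); a district with more neighbors is smoothed more strongly.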

2.4.4 Convolution Model

Convolution models include not only a random effect that corrects for overdispersion but also a random-effects term that controls for spatial autocorrelation, so they take both into consideration [50]. In this model, the district-specific random effects are decomposed into a component \(u_i\) that captures the effects that vary in a structured manner in space, i.e., the correlated heterogeneity, and a component \(v_i\) that models the effects that vary in an unstructured way between areas, i.e., the uncorrelated heterogeneity. Like the CAR model, this model was introduced by Clayton and Kaldor [28] in an empirical Bayes setting and developed by Besag et al. [29] in a fully Bayes implementation. The model is defined as:

$$\begin{aligned} \begin{array}{rll} y_i\sim &{}&{}\text{ Poisson }(e_i\theta _i)\\ \log (\theta _i)=&{}\alpha +u_i+v_i &{}\quad ; \quad i=1,2,3,\ldots ,64 \end{array} \end{aligned}$$

The model uses a spatial correlation structure to estimate the risk in any area which depends on neighbouring areas. This is assumed to be normally distributed i.e.

$$\begin{aligned} \begin{array}{rlll} {[}&{}u_i&{}|&{}u_j,i\ne j,{{\tau }_u}^2]\sim N({\overline{u}}_i,{{\sigma }_i}^2)\\ {\overline{u}}_i&{}=\frac{1}{\sum _j{{\omega }_{ij}}}\sum _j u_j{\omega }_{ij},&{}{{\sigma }_i}^2=\frac{{{\sigma }_u}^2}{\sum _j{{\omega }_{ij}}}.&{} \end{array} \end{aligned}$$

2.4.5 Modified CAR Model

The Poisson model is convenient and elegant from a mathematical perspective, but its restrictive nature makes an extension necessary. Firstly, the model does not accurately describe the variation in the data, and secondly, hierarchies are frequently accounted for by including random effects that are assumed to be normally distributed. The modified CAR model takes into account both overdispersion and clustering by employing two distinct sets of random effects, one gamma and the other normal [32]. This model is also called the “combined model” in Neyens et al. [50]. The model is defined as:

$$\begin{aligned} y_i\sim & {} \text{ Poisson }(e_i\theta _i)\\ \log (\theta _i)= & {} \log (g_i)+\alpha +u_i \end{aligned}$$

where \(g_i\) terms, which are assumed to follow a gamma distribution, are used to model uncorrelated heterogeneity

$$\begin{aligned} g_i\sim \text{ Gamma }(a,b) \end{aligned}$$

whereas the correlated heterogeneity is modeled through the CAR random effects \(u_i\):

$$\begin{aligned} \begin{array}{rlll} {[}&{}\qquad u_i&{}|&{}u_j,i\ne j,{{\tau }_u}^2]\sim N({\overline{u}}_i,{{\sigma }_i}^2)\\ {\overline{u}}_i&{}=\frac{1}{\sum _j{{\omega }_{ij}}}\sum _j u_j{\omega }_{ij},&{}{{\sigma }_i}^2=\frac{{{\sigma }_u}^2}{\sum _j{{\omega }_{ij}}}.&{} \end{array} \end{aligned}$$

In contrast to the convolution model, the modified CAR model models uncorrelated heterogeneity with a gamma distribution instead of a lognormal distribution [50]. The research conducted by [32] demonstrates that the gamma distribution can accurately model extra-variance. They provide a more detailed theory for multiple data types, which is useful for the combined model.

When overdispersion random effects are present alongside normal ones, [32] modified the notion of conjugacy to account for them. The goal of this property is to ensure that strong conjugacy holds even in the presence of random effects that follow a normal distribution. That is to say, conjugacy is considered conditionally on the normally distributed random effect \(u_i\). Conditional on \(u_i\), the Poisson and gamma distributions are conjugate, and the posterior distribution of the gamma random effect is

$$\begin{aligned} g_i|u_i,y_i \sim \text{ Gamma }(a+y_i,b+e_i \kappa _i) \end{aligned}$$

with \(\kappa _i = \exp (\alpha +u_i)\). As a result, the conditional mean of \(g_i\) is \((a + y_i)/(b + e_i \kappa _i)\), which can be rewritten as a weighted average of the prior mean a/b and the data-based estimate \(y_i/(e_i \kappa _i)\).
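As a sketch of the algebra behind this weighted-average form (the decomposition below follows directly from the expression for the conditional mean and is not quoted from [32] or [50]):

$$\begin{aligned} \frac{a+y_i}{b+e_i\kappa _i}=\frac{b}{b+e_i\kappa _i}\cdot \frac{a}{b}+\frac{e_i\kappa _i}{b+e_i\kappa _i}\cdot \frac{y_i}{e_i\kappa _i} \end{aligned}$$

The two weights sum to one. When \(e_i\kappa _i\) is large relative to b, the conditional mean is dominated by the data-based ratio \(y_i/(e_i\kappa _i)\); when it is small, the conditional mean is pulled towards the prior mean a/b.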

2.5 Deviance Information Criterion

For model comparison, we use the deviance information criterion (DIC) together with a related measure, \(p_D\), the effective number of model parameters [52]. How to define the effective number of parameters in a Bayesian framework, particularly for complicated models, is a crucial question. A DIC difference greater than 10 rules out the model with the higher DIC, while a DIC difference of less than 5 is not considered substantial. Since DIC is computed from MCMC output, it is sensitive to sampling fluctuations [53].

To demonstrate that DIC is additive when models and priors are independent of one another, let \(\theta \) be the vector of parameters associated with y, and let \(f(y|\theta )\) and f(y) denote the conditional and marginal distributions of y. Then,

$$\begin{aligned} DIC=\overline{D}+p_D \end{aligned}$$

where \(\overline{D}\) is the posterior expected value of the deviance function and \(p_D\), the effective number of parameters, is defined as:

$$\begin{aligned} p_D= & {} \overline{D}-D(\bar{\theta })\\ \ \ \overline{\theta }= & {} E\left[ \theta |y\right] \\ \overline{D}= & {} E\left[ D\left( \theta \right) |y\right] \end{aligned}$$

are the posterior means of \(\theta \) and the Bayesian deviance

$$\begin{aligned} D\left( \theta \right) =-2{\textrm{ln} \left\{ f\left( y|\theta \right) \right\} \ }+2{\textrm{ln} \left\{ f\left( y\right) \right\} \ } \end{aligned}$$
(4)

Suppose y and \(\theta \) are partitioned as \((y_1,\ldots ,y_K)\) and \((\theta _1,\ldots ,\theta _K)\) for K categories. Define \(DIC_k={\overline{D}}_k+p_k\), \(p_k={\overline{D}}_k-D_k\left( {\overline{\theta }}_k\right) \), \({\overline{D}}_k=E\left[ D_k\left( {\theta }_k\right) |y_k\right] \), \({\overline{\theta }}_k=E[{\theta }_k|y_k]\), \(D_k\left( {\theta }_k\right) =-2{\textrm{ln} \left\{ f\left( y_k|{\theta }_k\right) \right\} \ }+2{\textrm{ln} \left\{ f\left( y_k\right) \right\} \ }\).

Under independent models and priors, \(f\left( y|\theta \right) =\prod ^K_{k=1}{f(y_k|{\theta }_k)}\) and \(f\left( y\right) =\prod ^K_{k=1}{f(y_k)}\). These multiplicative conditional and marginal distributions of y contribute additively to the Bayesian deviance in Equation 4, so that \(DIC= \sum ^K_{k=1}{DIC_k}\) [51]. A small \(\overline{D}\) corresponds to a well-fitted model. When DIC differences were borderline, the less complex model with the lower \(p_D\) was preferred [50].
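The definitions of \(\overline{D}\), \(p_D\), and DIC can be illustrated with a small Python sketch for a Poisson likelihood. The counts, expected values, and posterior draws are hypothetical stand-ins for MCMC output, and the standardizing term \(2\ln f(y)\) is omitted as an assumption because it depends on the data only and cancels when models fitted to the same data are compared.

```python
import math


def poisson_deviance(y, e, theta):
    """D(theta) = -2 ln f(y | theta) for y_i ~ Poisson(e_i * theta_i)."""
    loglik = 0.0
    for yi, ei, ti in zip(y, e, theta):
        lam = ei * ti
        loglik += yi * math.log(lam) - lam - math.lgamma(yi + 1)
    return -2.0 * loglik


def dic(y, e, theta_samples):
    """Return (DIC, D_bar, p_D) computed from posterior draws of theta."""
    m = len(theta_samples)
    d_bar = sum(poisson_deviance(y, e, th) for th in theta_samples) / m
    theta_bar = [sum(th[i] for th in theta_samples) / m for i in range(len(y))]
    p_d = d_bar - poisson_deviance(y, e, theta_bar)  # D_bar - D(theta_bar)
    return d_bar + p_d, d_bar, p_d


# Hypothetical data for three districts and four posterior draws of the relative risks.
y = [12, 5, 30]
e = [10.0, 6.0, 20.0]
theta_samples = [
    [1.1, 0.8, 1.5],
    [1.3, 0.9, 1.4],
    [1.2, 0.7, 1.6],
    [1.0, 1.0, 1.5],
]
dic_value, d_bar, p_d = dic(y, e, theta_samples)
print(f"D_bar = {d_bar:.3f}, p_D = {p_d:.3f}, DIC = {dic_value:.3f}")
```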

2.6 Computational Procedure

R version 4.2.0 (run in RStudio) was used for the spatial autocorrelation analysis. The maptools package was used to visualize the affected cases, with the readShapePoly function reading the shape files. The functions moran.test and geary.test (available in the spdep package) measure spatial autocorrelation. Before these two statistics are computed, the poly2nb function compiles a list of neighboring districts based on their adjacent boundaries, meaning that they share one or more boundary points, and the nb2listw function adds spatial weights to this neighbors list. The p-value of moran.test is calculated analytically rather than by Monte Carlo (MC) simulation; since the analytical result is not always conclusive, moran.mc can be used to test significance by MC simulation. The localmoran function provides, for each district, the local Moran I value together with its expected value, variance, and p-value. GeoDa version 1.20.0.10 is used to identify significant areas of relative risk via a LISA cluster map that employs 999 simulations at a significance level of 10%.
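The idea behind a Monte Carlo significance test for Moran's I can be sketched independently of any package. The Python code below is not the spdep implementation: it is a minimal permutation test under stated assumptions (a toy chain of eight districts, row-standardized weights, a one-sided alternative, and a rank-based p-value), and all names and data in it are hypothetical.

```python
import math
import random


def morans_i(x, w):
    """Global Moran's I computed from z-scores and a weight matrix w."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    z = [(v - mean) / sd for v in x]
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return n * num / ((n - 1) * sum(sum(row) for row in w))


def moran_permutation_test(x, w, nsim=599, seed=0):
    """Shuffle the values over the districts nsim times and rank the observed I."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    shuffled = list(x)
    count = 0
    for _ in range(nsim):
        rng.shuffle(shuffled)
        if morans_i(shuffled, w) >= observed:
            count += 1
    return observed, (count + 1) / (nsim + 1)


# Hypothetical chain of eight districts with a smooth spatial trend.
n = 8
w = [[0.0] * n for _ in range(n)]
for i in range(n):
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
    for j in nbrs:
        w[i][j] = 1.0 / len(nbrs)
observed, p_value = moran_permutation_test([1, 2, 3, 4, 5, 6, 7, 8], w)
print(f"observed I = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

Because the p-value comes from random permutations, it varies slightly with the seed and with the number of simulations.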

WinBUGS, a statistical software package for Bayesian analysis using Markov Chain Monte Carlo (MCMC), is used to fit the Bayesian models and analyze the spatial data. This software is based on BUGS (Bayesian inference Using Gibbs Sampling), and it also offers a goodness-of-fit measure called the deviance information criterion, which can be used to compare models [54]. For each model, two separate chains starting from different arbitrary initial values were used to compute the posterior estimates in the Bayesian hierarchical model. Dynamic trace plots were used to check that the two chains mixed well, with 100000 iterations, of which 20000 were discarded as a burn-in sample. To improve convergence and reduce the effect of autocorrelation, a thinning interval of 5 was used.
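For orientation, the number of retained draws implied by these settings can be worked out as below. This sketch assumes that the 100000 iterations are run per chain and that the burn-in is counted within them; the text does not state either point explicitly, and the function name is hypothetical.

```python
def retained_draws(n_iter, burn_in, thin, n_chains):
    """Draws kept per chain and in total after burn-in removal and thinning."""
    kept_per_chain = len(range(burn_in, n_iter, thin))
    return kept_per_chain, kept_per_chain * n_chains


# Assumed reading of the settings: 100000 iterations per chain, 20000 burn-in, thin = 5.
per_chain, total = retained_draws(n_iter=100_000, burn_in=20_000, thin=5, n_chains=2)
print(f"{per_chain} draws per chain, {total} draws in total")
```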

3 Data Analysis

From 2021 to 2022, Dhaka district, home to Bangladesh’s capital city, had the highest number of cases, with 498,171 (see Fig. 2). The districts surrounding Dhaka have fewer reported cases. Maintaining adequate safety precautions in a small, densely populated city is impossible. Chattogram ranks second and Khulna third in the number of infected cases. Ports, business advantages, improved communication, education, and other amenities drive Chattogram’s population growth. As it is the second-largest city, its number of affected cases is also high. Lalmonirhat has the fewest COVID-19 cases. Fewer people are affected when fewer people live in an area. Few people contracted the virus in the hilly parts of Bandarban. According to researchers, mountain dwellers can tolerate low oxygen levels and have a virus-free environment. Dry mountain air, high levels of UV radiation, and low barometric pressure combine to create an inhospitable habitat that lowers the survival rate of airborne viruses, which may benefit those who live in the mountains [54].

Figure 2 shows that Dhaka district has the highest prevalence rate. COVID-19 prevalence varies across districts. Rajshahi district, in the west of the country, has a lower prevalence rate than Dhaka. The disease is highly prevalent in Khulna district, which is situated in the south. Although Khulna’s population is about one fifth that of Chattogram and Chattogram has more people affected by COVID-19, the calculated relative risk for Khulna is greater than that of Chattogram, because Khulna’s number of cases is high relative to its population. In Faridpur, Gopalganj, and Rajbari districts, prevalence rates are lower than in Dhaka. As the virus evolved and underwent mutations, an increasing number of people contracted the disease and died. Dhaka has the most people affected by the outbreak, and its dense population makes it vulnerable. Lower prevalence rates have been observed in Bangladesh’s northern (Sunamganj) and northeastern (Habiganj) districts. The northern district of Gaibandha has the lowest prevalence.

Fig. 2 (a) District-wise number of affected cases; (b) district-wise prevalence rate

The global Moran I of COVID-19 relative risk is 0.0846, which indicates positive spatial autocorrelation between districts (Table 1). The p-value of 0.0111, which is less than 0.05, and the corresponding z-value of 2.54 show that the null hypothesis should be rejected. A Monte Carlo (MC) simulation of the global Moran I with 599 permutations yields the same p-value of 0.0111, so the null hypothesis is rejected at the 5% significance level in both cases. A density plot (Fig. 3) of the Monte Carlo permutation outcomes further illustrates how likely the observed test statistic is under the null hypothesis. Moreover, the Geary C statistic is 0.8786, which falls within the interval [0,1), indicating positive autocorrelation between districts. Both spatial autocorrelation analyses indicate that the COVID-19 relative risk is clustered across districts.
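As a rough consistency check, the reported z-value can be converted to a p-value in Python. This sketch assumes a standard normal reference distribution and a two-sided alternative; neither is stated in the text, so the agreement with the reported p-value is only suggestive.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


p = two_sided_p_value(2.54)  # z-value reported for the global Moran I
print(f"two-sided p-value for z = 2.54: {p:.4f}")
```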

Table 1 Moran I & Geary C Statistic are calculated under randomization
Fig. 3 Density plot of global Moran I

The LISA cluster map in Fig. 4 accounts for the spatial lag and the spatial weights of neighboring districts and shows the districts with significant weighted spatial homogeneity at the 90% confidence level. This widely used choropleth map classifies locations with a significant local Moran statistic (Equation 2) by type of spatial association. Bright red indicates a High-High spatial cluster and bright blue a Low-Low spatial cluster; light blue indicates a Low-High spatial outlier and light red a High-Low spatial outlier. Four districts, Dhaka, Munshiganj, Narayanganj, and Faridpur, form a statistically significant High-High spatial cluster: the relative risk of COVID-19 is high in these districts and also in the districts adjacent to them. The districts that form a statistically significant Low-Low spatial cluster are Nilphamari, Lalmonirhat, Rangpur, Kurigram, Gaibandha, Dinajpur, Bogura, Joypurhat, Jamalpur, Sherpur, Mymensingh, Netrokona, Sunamganj, Kishoreganj, Habiganj, and Sylhet; here the relative risk appears to be low both in the districts themselves and in their neighbors. Although Chattogram has a population five times larger than that of Khulna and has recorded more COVID-19 cases, the calculated relative risk for Khulna is higher than that for Chattogram, because relative risk is not proportional to population size. In addition, Tangail, Gazipur, Manikganj, Khagrachari, Bandarban, Madaripur, and Satkhira are categorized as Low-High spatial outliers: these seven districts have a low relative risk, whereas the districts surrounding them typically have a high relative risk.
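The four-way classification described above can be illustrated with a minimal sketch. It assumes a row-standardized binary adjacency matrix and uses the common form of the local Moran statistic (standardized value times its spatial lag), which may differ from Equation 2 by a constant. The five relative-risk values and the chain-shaped adjacency are hypothetical and are not taken from the study data. The sketch also omits the significance test, which in practice is usually carried out by conditional permutation.

```python
import numpy as np

def lisa_quadrants(x, w):
    """Return local Moran values and quadrant labels (no significance test)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    # Row-standardize so each spatial lag is the average over neighbors.
    # Assumes every area has at least one neighbor (no zero rows).
    w = w / w.sum(axis=1, keepdims=True)
    z = (x - x.mean()) / x.std()   # standardized values
    lag = w @ z                    # spatial lag of the standardized values
    local_i = z * lag              # local Moran statistic per area
    labels = np.where(
        z >= 0,
        np.where(lag >= 0, "High-High", "High-Low"),
        np.where(lag >= 0, "Low-High", "Low-Low"),
    )
    return local_i, labels

# Hypothetical relative risks for five areas arranged in a chain 0-1-2-3-4.
risk = [3.0, 2.5, 0.4, 0.5, 0.3]
adjacency = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
local_i, labels = lisa_quadrants(risk, adjacency)
for area, (value, label) in enumerate(zip(local_i, labels)):
    print(area, round(float(value), 3), label)
```

In this toy layout the third area has a low value but sits next to a high-value neighbor, so it is labeled Low-High and receives a negative local statistic, which is the pattern reported for the seven outlier districts.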

Fig. 4 Spatial Clustering (local Moran’s I) of COVID-19 Relative Risk

Table 2 displays the summary statistics of the posterior estimators, including the 95% credible intervals. A 95% credible interval means that, given the data and the model, the parameter lies between the lower and upper limits with a posterior probability of 95%. Figures 6, 9, 12, 15, and 18 (in “Appendix”) show an approximately normal density for the overall mean. On the other hand, the densities of the variance and precision parameters are suggestive of a chi-square distribution.
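As an illustration of how such an interval can be read off MCMC output, the following sketch computes an equal-tailed 95% credible interval from posterior draws. The draws are simulated from a normal distribution with an arbitrary mean and spread; they are a hypothetical stand-in and do not reproduce any entry of Table 2. The function name `credible_interval` is likewise introduced here only for illustration.

```python
import numpy as np

def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior draws."""
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(samples, [tail, 1.0 - tail])
    return float(lower), float(upper)

rng = np.random.default_rng(0)
# Hypothetical posterior draws standing in for sampler output.
draws = rng.normal(loc=0.2, scale=0.05, size=20_000)

lower, upper = credible_interval(draws)
# Share of draws falling inside the interval; close to 0.95 by construction.
inside = float(np.mean((draws >= lower) & (draws <= upper)))
print(round(lower, 3), round(upper, 3), round(inside, 3))
```

The equal-tailed interval is one common choice; a highest posterior density interval would be an alternative when the posterior is clearly skewed, as it may be for variance and precision parameters.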

Table 2 Summary statistics of Poisson-Gamma, Poisson-Lognormal, CAR, convolution and modified CAR models

According to [55], an autocorrelation plot can “indicate dimensions of the posterior distribution that are mixing slowly, where slow mixing is often associated with high posterior correlations between parameters”. Figures 7 and 13 (in “Appendix”) demonstrate that the estimators are mixing well and that the autocorrelation dies out rapidly in each case. Hence, no substantial autocorrelation is present here. In contrast, Figs. 10, 16, and 19 (in “Appendix”) exhibit poor mixing for \(\alpha \), and the autocorrelation does not decrease appreciably in these cases. Therefore, substantial autocorrelation exists for \(\alpha \) in these models.
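The contrast between fast and slow decay of autocorrelation can be sketched as follows. Both chains are simulated: independent draws stand in for a well-mixing parameter, and a strongly persistent AR(1) series (coefficient 0.95, chosen arbitrarily) stands in for a slowly mixing one such as \(\alpha \). Neither chain comes from the fitted models.

```python
import numpy as np

def autocorrelation(chain, max_lag=20):
    """Sample autocorrelation of a chain at lags 0..max_lag."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

rng = np.random.default_rng(1)
n = 5000

# Well-mixing chain: independent draws, autocorrelation near zero after lag 0.
well_mixed = rng.normal(size=n)

# Slowly mixing chain: AR(1) with strong persistence (assumed coefficient 0.95).
noise = rng.normal(size=n)
slow = np.empty(n)
slow[0] = 0.0
for t in range(1, n):
    slow[t] = 0.95 * slow[t - 1] + noise[t]

acf_fast = autocorrelation(well_mixed)
acf_slow = autocorrelation(slow)
print("lag 5, well mixed:", round(float(acf_fast[5]), 3))
print("lag 5, slow:      ", round(float(acf_slow[5]), 3))
```

A chain whose autocorrelation stays high over many lags carries fewer effectively independent draws, so thinning or a longer run is a typical remedy.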

The two main features desired in trace plots are stationarity and good mixing. For a path to be considered stationary, it must remain inside the posterior distribution. More explicitly, all of the traces congregate around a consistent central tendency. Figures 5, 8, 11, 14, and 17 (in “Appendix”) show stable stationarity. The second characteristic, “good mixing”, means that each sample of a parameter is not strongly related to the sample that came before it. As the trace moves across the posterior distribution without getting stuck in any one place, each path can be seen to move in a zigzag pattern. This second trait is also evident in the trace plots. Both the red and the blue chains display these two features.

Table 2 shows DIC values for four different models, two of which are non-spatial and two of which are spatial. By convention, the model with the lowest DIC value provides the best fit. The modified CAR model has the lowest DIC value, whereas the DIC values of the other models are almost identical to one another. Its \(\overline{D}\) and \(p_D\) values are also relatively low. The modified CAR model therefore outperforms the other models.
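For reference, the sketch below computes DIC under the usual definition, DIC = \(\overline{D}\) + \(p_D\) with \(p_D\) equal to \(\overline{D}\) minus the deviance evaluated at the posterior mean. This definition is assumed here to be the one behind Table 2. The example uses a deliberately simple one-parameter Poisson model with hypothetical counts and an assumed weak Gamma(0.01, 0.01) prior, not any of the models fitted in this study.

```python
import numpy as np
from math import lgamma

def poisson_deviance(y, rate):
    """Deviance, -2 * log-likelihood, of i.i.d. Poisson counts at a common rate."""
    y = np.asarray(y, dtype=float)
    log_fact = sum(lgamma(v + 1.0) for v in y)   # sum of log(y_i!)
    return -2.0 * (y.sum() * np.log(rate) - y.size * rate - log_fact)

def dic(deviance_draws, deviance_at_mean):
    """Return (D_bar, p_D, DIC) from posterior deviance draws."""
    d_bar = float(np.mean(deviance_draws))       # posterior mean deviance
    p_d = d_bar - float(deviance_at_mean)        # effective number of parameters
    return d_bar, p_d, d_bar + p_d

rng = np.random.default_rng(2)
y = np.array([4, 7, 5, 6, 3, 8, 5, 6, 4, 7])     # hypothetical counts

# Conjugate Gamma posterior for the rate under the assumed Gamma(0.01, 0.01) prior.
rate_draws = rng.gamma(shape=0.01 + y.sum(), scale=1.0 / (0.01 + y.size), size=10_000)

deviance_draws = poisson_deviance(y, rate_draws)  # one deviance per posterior draw
d_bar, p_d, dic_value = dic(deviance_draws, poisson_deviance(y, rate_draws.mean()))
print(round(d_bar, 2), round(p_d, 2), round(dic_value, 2))
```

With a single free parameter and a weak prior, \(p_D\) comes out close to 1, which illustrates its reading as an effective number of parameters. When several models are compared, the one with the smallest DIC is preferred.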

4 Discussion and Conclusions

This research aims to ascertain the degree to which COVID-19 cases differ in their spatial distribution across the 64 districts of Bangladesh. Four Bayesian hierarchical models, both spatial and non-spatial, were used to verify the heterogeneity of the spatial data. Spatial models help explain geographical differences, and the fitted models indicate that the fit is not the same everywhere. The district-level analysis of spatial autocorrelation provides information about the districts’ embeddedness and spatial dependency and indicates that the districts are significantly clustered. Since the number of affected cases is proportional to the relative risk, it is reasonable to expect fewer affected cases where the relative risk is low. Dhaka was found to have the highest relative risk among the districts of Bangladesh. Additionally, there is evidence that Khulna has a high risk, although one that is lower than that of the capital. The results of this study show that the risk is also higher in districts with many people, such as Chattogram. Overpopulation contributes to a higher risk, along with fast transmission, lack of safety measures, and failure to take precautions.

This research looks at the overall situation of COVID-19 in each district. The government should spread information, make safety materials available, and keep the cost of preventive measures at a reasonable level. A vaccination campaign must be carried out so that people develop antibodies. If the disease can be stopped from spreading in densely populated areas, the number of people who are adversely affected can also be reduced.

In this research, we focused on one response variable and did not consider covariates other than the population density of each district. The results would be more generalizable if other related risk factors were included in the model. Further research will be performed using a Bayesian hierarchical spatial model with other related covariates. Such an investigation is expected to reveal further significant information.